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Abstract 



A statistical test is presented to decide whether data are ade- 
quately described by probabilistic functions of finite state Markov 
chains ("hidden Markov models") as applied in the analysis of ion 
channel data. Particularly, the test can be used to decide whether a 
system obeys the Markov condition. Simulation studies are performed 
in order to investigate the sensitivity of the proposed test against vio- 
lations of the model assumptions. The test can be applied analogously 
to Markov models. 
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I. Introduction 



Ion channels are large proteins located in the membranes of cells. They serve 
for signal transmission and regulate the concentration of ions in the cell. 
The channels open and close in a stochastic manner dependent on external 
conditions like trans-membrane voltage difference, concentration of ligands or 
mechanical stress. In general the channels have several states in which they 
are closed, resp. open. They might even possess open states with different 
conductivity. The noisy current in the range of pA through single channels 
can be measured by the patch clamp technique QX[ . 

Analyzing data from ion channels generally relies on the assumption of a 
Markovian dynamics. This holds for infering the number of channel states 
and mean dwell times by fitting exponentials to dwell time histograms [fj, [J, 
for explicit modeling of low-pass filtered records by Markov models 0, [J and 
also for analyzing unfiltered records by hidden Markov models [|, |7|, [Sj |9j |10 
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In many cases, however, it is not evident from empirical data whether the 
system actually obeys the Markov condition. For two reasons, this assump- 
tion has given rise to a lively discussion ^2, 13|, [14[]. On the one hand, the 
information about the validity of this condition can provide valuable insight 
into the system under investigation |15 , |16| . On the other hand, conclusions 
drawn from a model which does not fit to the process that has produced the 
data are very likely to lead to erroneous results. Thus, it is desirable to test 
whether the process is adequately described by the selected (hidden) Markov 
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model. 

We propose a test to perform this task. It is based on the asymptotic 
distribution of the log-likelihood that holds if the model is valid. A deviation 
from the expected distribution provides a test for the model. In order to 
evaluate the sensitivity of the proposed test against a violation of the null 
hypothesis four simulation studies are performed where the assumption of 
an underlying hidden Markov model is violated in various manners. A fifth 
simulation study shows that the test is also useful to estimate the minimum 
number of states in a Markov model needed to be compatible with the data. 

The paper is organized as follows: In the next section, we briefly review 
the hidden Markov model. In Section III the test statistic is introduced. 
The power of test is evaluated by simulation studies in the Section IV. As 
presented, the test applies to hidden Markov models, however, it can be 
applied analogously to Markov models. 

II. Hidden Markov Models 

Hidden Markov models (HMM), introduced by (DJ and used in diverse fields 
like speech recognition and ion channel analysis, are generalizations of 
Markov models that allow to include observational noise. HMMs can be 
formulated in continuous-time and discrete-time versions. Following we 
chose the latter. The results also hold for continuous-time models. 

A stationary hidden Markov model is given by an unobservable process 
X t which can take one of s states for every point t in time. The probabilities 
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for a change from a state i to a state j are described by a time independent 
transition matrix (a y ) — 1, s). Since each row of the matrix is 
normalized to unity, the s x s matrix (a y -) has s(s — 1) free parameters. 
The observations Y t are determined by the output probability densities of 
each of the s states. These densities are described by parameter vectors </>j 
(i — 1, . . . , s). For example, the density functions f(y, can be given by 
Gaussian distributions with different means and variances. 

For ease of notation, the parameters of the hidden Markov model are 
arranged in a single parameter vector 6. Its dimension is denoted by r. For 
example, in the case of s states with Gaussian output probabilities the model 
has r = s(s — 1) + 2s = s 2 + s parameters. 

Given an observed time series Yi.jv = YJ., Yjv of length N, the 

parameter vector 6 can be estimated by a maximum likelihood procedure 



[ |T9| , [20| , |21|j . For the calculation of the log-likelihood function 

Ljv(Y"i..jv,0) = log P\Yi..jf\0] (1) 

the so called forward probabilities to find the system in state i at time t given 
the data up to time t are defined by : 

a i (t,Y 1 ... t ,O) = P[X t = i,Y l ... t \0] . (2) 

They can be calculated using the recursion: 

Oi(t, Yi___ t , 0) = £ OL S {t - 1, Y M , 6) a %] {0)f(Y t , fy) (3) 
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and lead to the log-likelihood function by: 

L N (Y 1 ... N \e) = \ogj2^(N,Y 1 ... N ,e) . (4) 

1=1 

An estimate 0n can be obtained by maximizing Ln(Yi...n\0) with respect 
to either by nonlinear optimization or by the Expectation - Maximization 
algorithm, i.e. the Baum - Welsh reestimation formulae [|T^, E3. Here, all 
numerical calculations have been performed by the latter method as described 
in 



p0| since it behaves numerically more stable than nonlinear optimization. 
For ease of notation we suppress the dependence of L n (Yi...n\0) on Y"i...jv 
in the following. 

III. The Test Statistic 

In this section we introduce the statistic to test the adequacy of a given 
hidden Markov model to describe an observed time series. 

Under mild regularity conditions, the difference between the maximum 
likelihood estimators 0n and the true parameters Oq are generally believed 
due to central limit theorems to converge to a normal distribution 

VN(0 o -0 N )~fS(O,Z) (5) 

with asymptotically : 

°~ L N (0 N ) - -i-Er. 1 . (6) 



This has been proven for independent random variables (see e.g. ||23|| ) 



Markov models [24] and hidden Markov models with discrete output proba- 



bilities [I7[|. For hidden Markov models with continuous output probabilities, 



up to now, the consistency of the maximum likelihood estimators ||25|| , the lo- 



cal asymptotic normality in the sense of Le Cam |26|, and the asymptotic 



normality of maximum split data likelihood estimators has been shown [£S 
The proof of asymptotic normality of the maximum likelihood estimators in 
hidden Markov models is announced |29 



Given the asymptotic normality of the estimators of Eqs. @, the distri- 
bution of the maximum log-likelihood L^{9^) that is itself a random variable 
can be derived by a Taylor expansion (see [|^| for a detailed discussion) : 

d 

Ln(6o) = L N {6 N ) + ——L N (9 N )(9 — 9 N ) + 

\{o Q - e N )^^-L N (e N )(e - o N ) + o(\e - e N \ 3 ) . (7) 

The second term on the right hand side vanishes due to the estimation proce- 
dure. Neglecting higher order terms, solving for 2{L{9^) — L(9 )) and using 
Eqs. (ID and @ yields : 

2 [L N (d N ) - L N {0 )) ~ xl ■ (8) 

This relation holds asymptotically if the model is specified correctly. The 
number N of data needed to reach the asymptotic regime depends on the 
process. Simulation studies not presented here show that Eq. (||) holds if 
each transition between the states has occurred at least 10 times. 

For the test we estimate #o based on the whole time series of length N 
and denote this estimate by 9^. Then, the time series is divided in K parts 
of length M = N/ K. For each the these parts we estimate the parameters 
9m and evaluate the log-likelihoods L m {9m) and L M {9 N ). Asymptotically, 



i.e. for N —> oo, M -> oo, but M/N -> 0, the distribution of 2 [L M (0 M ) - 
L M (0 N )] is given by: 

2{L M {b M )-L M {b N )\ . (9) 

By the proposed procedure we obtain samples of the Xr distribution if 
the model is valid. In order to judge whether Eq. (|) holds, we apply the 
Kolmogorov - Smirnov - test for the consistency of an empirical distribution 
with a proposed theoretical distribution ||30|| . The Kolmogorov - Smirnov - 
test statistic is denoted by Z in the following. 

IV. Evaluation of the power of the test 

In this section, we evaluate the power of the above proposed test, i.e. we 
investigate the sensitivity of the test against a violation of the null hypothesis 
that the data were produced by a hidden Markov model. Of course, it is 
not possible to consider all imaginable alternative hypotheses. One has to 
restrict oneself to a reasonable class of alternative hypotheses. We choose 
four alternative hypotheses that violate the model assumptions: 

• Nonstationary transition probabilities 

• Dwell time dependent transition probabilities 

• A fractal model 

• Refractory time 
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Finally, we show that the proposed test enables to estimate the smallest 
number of states of the Markov process compatible with the data. 



A. Model definition 

In order to evaluate the power of the test numerically we chose a hidden 
Markov model with three states and Gaussian output probability functions 
representing e.g. one ion channel with two different conductance levels. The 
transition matrix A is given by: 

/ 0.90 0.05 0.05 \ 
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(10) 



0.06 0.92 0.02 
V 0.03 0.02 0.95 / 

The means and the variances of the gaussian output probability functions 

were chosen to be: 





= 0.0 


-I 


= 0.1 




// 2 


= 1.0 




= 0.1 


(11) 
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= 2.0 
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= 0.1 





The dimension r of the parameter vector 6 is 12. We simulated time series 
of length N = 150.000 and divided it into K = 150 time series of length 
M = 1000 to perform the test. To apply the test the resulting time series 
must be long enough for the asymptotic results to be valid. If the off - 
diagonal elements of the transition matrix are of similar magnitude, as a rule 
of thumb, this condition is met, if the time series have a length of at least : 

M = 10 sr max (12) 

with s the number of states and r max the largest dwell time. For the chosen 

model, the dwell times are 10, 12.5, resp. 20 units of time. please 

locate Fig. [L] 
^ around here 



Fig. [1] shows the expected cumulative X12 distribution according to Eq. (|]) 
and the empirical cumulative distribution for the chosen process. It indicates 
a good qualitative agreement of the two distributions. In order to quantify 
this, we counted for 200 realizations of the process the number of cases where 
the hypothesis of consistency of the two distributions were rejected by the 
Kolmogorov - Smirnov - test at a significance level of 5 %. This results 
in actual rejection rate of 4.5 %, indicating that the asymptotic regime is 
reached for the chosen situation. 

B. Power of the test 

To investigate the power of a test, usually, for different degrees of violation 
of the null hypothesis in the order of thousand time series are realized, the 
test is performed and the fraction of rejected null hypothesis given a certain 
significance level a is calculated in dependence of the degree of violation. 
However, this procedure to evaluate the power requires an enormous com- 
putational effort to obtain a good approximation of the underlying smooth 
behavior since for the chosen model and number of data the maximization 
of the log-likelihood for a single time series requires ca. 45 min. on an IBM 
6000 RISC workstation. Therefore, we choose another way to display the 
power of the test. Instead of counting the simulation runs with rejected null 
hypothesis we average the test statistic of 10 realizations for each degree 
of violation of the null hypothesis to approximate the smooth curve. This 
procedure estimates the mean of the distribution of the test statistic for the 
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alternative hypotheses. Simulation studies show that these distributions of 
the test statistic are symmetric and that their variance is rather constant. 
Therefore, the mean calculated here corresponds to the median of the dis- 
tributions and is related monotonically to the fraction calculated usually. 
Thus, this procedure yields essentially the same information as the canonical 
method but requires only one per cent of computational effort. 

We now discuss the different simulation studies to evaluate the power of 
the proposed test. 

• Nonstationary transition probabilities 

In order to investigate the sensitivity against violations of the station- 
arity assumption, nonstationarity of the transition probability of the 
first state is introduced by : 

~ / n (s — l)vt „ s 

5n(t) = — (13) 

axj(t) = axj + ^r, (j = 2,...,s) , (14) 

where s again denotes the number of states. This time dependency of 
the transition probabilities causes a decreasing dwell time of the first 
state. The drift rate v serves as the parameter for the null hypothesis 
violation. As outlined above, we judge the performance of the test by please 
averaging the test statistic of ten simulations for every degree of the ^ oca ^ e Fig- 
null hypothesis violation. Fig. |2] shows the averaged test statistic Z of 
the Kolmogorov - Smirnov - test with increasing violation of the null 



around here 
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hypothesis and the 1%, resp. 0.1% levels of significance. A change of 
10% over the whole observation time in the dwell probability of one of 
three states is detectable by the proposed test. 

Dwell time dependent transition probabilities 

The Markov condition, stating that the transition probabilities between 
the states do not depend on the time already spent in the states is 
violated by increasing the probability to leave any state proportional 
to the time U n already spent in the state. The proportionality constant 
7 parameterizes the violation of the null hypothesis. 

au(t in ) = a u - (s - 1) "/t in (15) 
aij{t in ) = aij + lUn, (i^j) ( 16 ) 

Fig. ^| shows the result of the simulation. A change of more than one please 

per cent per time step for the dwell probabilities leads to the rejec- ^ oca ^ e Fig- 

around here 

tion of the hypothesis that the time series was generated by a Markov 
process. During the simulation, it was controlled that the condition 
< aij(t in ) < 1 was not violated. 

A fractal model 

Another possibility to violate the Markov condition is given by the 
fractal models |15|]. For these models, the dwell probability increases 



with the time t« n already spent in the state. The transition probabilities 
of a fractal model are given by: 

5ft (tin) = 1 - (1 - an) t\~ D (17) 
13 



aij{Un) = aijt\ n D i^J , (18) 

where D is the fractal dimension which parameterizes the violation of please 

the Markov condition. For D = 1 the Markov model results. The result ^ oca ^ e Fig- f| 

around here 

of the simulation in Fig. [| reveals that for the given model a fractal 
dimension of e.g. 1.1 will lead to a rejection of a Markovian process. 
On the other hand, a dimension larger than 1.1 can be excluded if the 
test does not rejected the model. 

• Refractory time 

Finally, the Markov condition is violated by introducing a refractory 

time, i.e. a minimal time that the process has to spend in a state. To 

simulate such processes we used the model according to Eq. (0) but 

forced the state to stay for the time r re f before the dynamics were 

applied. Fig. |5] displays the results. Note that the dwell times of please 

the chosen model were 10, 12.5, resp. 20 units of time, so that the ^ oca ^ e Fig- d 

around here 

considered type of violation is only detectable if it amounts to 50% of 
the shortest dwell time. 

In summary, the test enables a detection of different types of violations of 
the Markov condition. 

C. Estimating the minimum number of states 

So far, the number of states s of the Markov process was assumed to be 
known. Since the number of assumed states s determines the degrees of 
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freedom r of the model, the proposed test can be applied to infer the number 

of states of the process under investigation. This is done by comparing the 

left hand side of Eq. (§) with the x? distribution with degrees of freedom f 

corresponding to the assumed model, e.g. in the case of a Gaussian model: 

f = s 2 + s. Fig. ^ displays the results. Hidden Markov models with increasing please 

number of states are fitted to data from the model Eq. ( |10|) with three states. ^ oca ^ e Fig- I 

around here 

The test enables a determination of the (correct) smallest number of states 
that can describe the process. Note that models with more than three states 
are also detected as being consistent with the data. 

V. Discussion 

Markov and hidden Markov models are increasingly used in the analysis of 
patch clamp ion channel data. In many cases their adequacy for a given 
system has been assumed, but not tested using empirical data. If a record is 
a realization of a hidden Markov process, the asymptotic distribution of the 
log-likelihood is a \ 2 T distribution, its number of degrees of freedom r being 
given by the number of model parameters. Thus, a test for the consistency of 
the empirical distribution of a fitted model with the theoretical distribution 
provides a test whether the time series may be considered as a realization of 
a hidden Markov process. 

Based on the asymptotic distribution of the log-likelihood, we have in- 
troduced such a test. The test is analogously applicable to test Markov 
models. In order to investigate how sensitive the test is to detect a viola- 
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tion of the assumed model, we performed four simulation studies where we 
modified a hidden Markov process continuously in different ways to become 
nonmarkovian. The sensitivity of the proposed test depends on how the 
model assumption of a stationary Markov process is violated: The test has 
shown to be very sensitive if the violation results from drifting transition 
probabilities, from dwell time dependent transition probabilities or a fractal 
model. For example, a fractal dimension of 1.7 as reported in |T5| would lead 
to a highly significant rejection of the Markov model used in the simulation 
study. The test is less sensitive to detect refractory times in the system that 
retard the beginning of the Markovian dynamics. 

Furthermore, the proposed test can be used to estimate the minimum 
number of states in the Markov process necessary to describe the data. 

In applications, performing simulation studies as presented will reveal 
which degree of violation of the model assumptions is consistent with the 
fitted model and which degrees of violation can be excluded if the model can 
not be rejected. 

The test is suited for analyzing data recorded under steady state condi- 
tions like in the case of ligand dependent ion channels. For voltage dependent 
channels where numerous trials for a certain pulse protocol are recorded these 
single trials determine the length M in the proposed test. Further, it can 
help to decide whether observed changes in inactivation dynamics [pi 
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consistent with statistical fluctuations in a fitted model or have to be treated 
explicitly as modal gating between two different dynamics. 



16 



Acknowledgements 

We would like to thank A. Wilts and U.-P. Hansen for valuable comments 
on an earlier version of this manuscript. 



17 



References 

[1] E. Neher and B. Sakmann, Nature 260, 799 (1976). 

[2] S. Kijima and H. Kijima, J. Thee, Biol. 128, 423 (1987). 

[3] D. Colquhoun and F.J. Sigworth, in Single channel recording, edited 
by B. Sakmann and E. Neher, Plenum Press, New York, (1995), p. 483. 

[4] R. Horn and K. Lange, Biophys. J. 43, 207 (1983). 

[5] R. Horn and C. A. Vandenberg, Statistical properties of single sodium 
channels. J. Gen. Physiol. 84, 505 (1984). 

[6] S. H. Chung, J. B. Moore, L. Xia and L. S. Premkumar P. W. Gage, 
Phil. Trans. Roy. Soc. Lond. B 329, 265 (1990). 

[7] S. H. Chung, V. Krishnamurthy, and J.B. Moore, Phil. Trans. R. Soc. 
Lond. B 334 357, (1991). 

[8] D. R. Fredkin, and J. A. Rice, Proc. R. Soc. Lond. B. 249 125, (1992). 

[9] D. R. Fredkin, and J. A. Rice, Biometrics 48 427, (1992) 

[10] J. D. Becker, J. Honerkamp, J. Hirsch, E. Schlatter and R. Greger, 
Pflug. Arch. 224 328, (1994). 

[11] A. Albertsen and U. P. Hansen, Biophys. J. 67 1393, (1994). 

[12] S. J. Korn and R. Horn, Biophys. J. 54, 871 (1988). 

18 



[13] O.B. McManus, Spivak C.E., Blatz A.L., Weiss D.S. and K.L. Magleby, 
Biophys. J. 55, 383 (1989). 

[14] L.S. Liebovitch, Biophys. J. 55, 373 (1989). 

[15] L.S. Liebovitch, Fischbarg J. and J. P. Koniarek, Math. Biosci. 84, 37 
(1987). 

[16] P. Sansom, F.G. Ball, C.J. Kerry, R. McGee, R.L. Ransey, PN.P Ush- 
erwood, Biophys. J. 56, 1229 (1989). 

[17] L. E. Baum and T. Petrie, Ann. Math. Stat. 40, 1554 (1966). 

[18] L.R. Rabiner, IEEE Trans. Inf. Theory 37, 257 (1991). 

[19] L. E. Baum, T. Petrie, G. Soules and N. Weiss, Ann. Math. Stat. 41, 
164 (1970). 

[20] S. E. Levinson, L. R. Rabiner and M. M. Sondhi, Bell Sys. Tech. J. 62, 
1035 (1983). 

[21] W. Qian and Titterington D.M. Phil, Trans. R. Soc. Lond. A 377, 407 
(1991). 

[22] A. P. Dempster, N. M. Laird and D. R. Rubin, J. Roy. Stat. Soc. B 
39, 1 (1977). 

[23] D.R. Cox and D.V. Hinkley, Theorectical Statistics. Chapman and Hall 
(1974) 

19 



[24] P. Billingsley, Statistical Inference for Markov Processes. University of 
Chicago Press (1961). 

[25] B. G. Leroux, Stoch. Process. Applic. 40, 127 (1992). 

[26] L. LeCam amd G. Yang, Asymptotics in Statistics. Some basic Con- 
cepts. Springer. New York (1990) 

[27] P.J. Bickel and Y. Ritov, preprint 

[28] T. Ryden, Ann. Stat. 22, 1884 (1994). 

[29] T. Ryden, personal communication 

[30] L. Sachs, Applied Statistics. Springer (1984). 

[31] C.C. Cannon and S. M. Strittmatter, Neuron 10, 317 (1993). 



20 



1 Figure Captions 

Fig. 1: The empirical cumulative distribution of 2 [L M {6 M ) — L M {6 N )] 
(solid line) and the expected cumulative Xu distribution (dotted line) 
for the process defined by Eqs. (110) and fllTI). 

Fig. 2: The effect of drifting transition probabilities. Shown is the averaged 
test statistic Z of the Kolmogorov - Smirnov test for increasing drift 
rates v. The 1% and the 0.1% significance levels are marked. 

Fig. 3: The effect of dwell time dependent transition probabilities. Shown 
is the averaged test statistic Z for increasing degrees 7 of the null hy- 
pothesis violation. The 1% and the 0.1% significance levels are marked. 

Fig. 4: Violation of the Markov condition by a fractal model. Shown is the 
averaged test statistic Z for increasing fractal dimension D. 

Fig. 5: The effect of refractory time. Shown is the averaged test statistic Z 
for increasing refractory times r re f . 

Fig. 6: Determining the number of states. Shown is averaged test statistic 
Z for hidden Markov models with different number of states s applied 
to time series that were generated by a Hidden Markov Model with 
three states. The 1% and the 0.1% significance levels are marked. 
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Figure 2: 



23 




24 




Figure 4: 
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